 shadow library


A Judge Says Meta's AI Copyright Case Is About 'the Next Taylor Swift'

WIRED

US District Court Judge Vince Chhabria spent several hours grilling lawyers from both sides after they each filed motions for partial summary judgment, meaning they want Chhabria to rule on specific issues of the case rather than leaving each one to be decided at trial. The authors allege that Meta illegally used their work to build its generative AI tools, emphasizing that the company pirated their books through "shadow libraries" like LibGen. Kadrey v. Meta is one of dozens of lawsuits filed against AI companies that are winding through the US legal system. While the authors focused heavily on the piracy element of the case, Chhabria spoke emphatically about his belief that the big question is whether Meta's AI tools will hurt book sales and otherwise cause the authors to lose money. "If you are dramatically changing, you might even say obliterating, the market for that person's work, and you're saying that you don't even have to pay a license to that person to use their work to create the product that's destroying the market for their work--I just don't understand how that can be fair use," he told Meta lawyer Kannon Shanmugam.


'Meta has stolen books': authors to protest in London against AI trained using 'shadow library'

The Guardian

Novelists Kate Mosse and Tracy Chevalier as well as poet and former Royal Society of Literature chair Daljit Nagra will be among those in attendance outside the company's King's Cross office. Protesters will meet at Granary Square at 1.30pm and a letter to Meta from the Society of Authors (SoA) will be hand-delivered at 1.45pm. It will also be sent to Meta headquarters in the US. Earlier this year, a US court filing alleged that Meta CEO Mark Zuckerberg approved the company's use of a notorious "shadow library", LibGen, which contains more than 7.5 million books. Last month, the Atlantic republished a searchable database of the titles contained in LibGen, through which many authors discovered their works may have been used to train Meta's AI models.


Lawsuit says Mark Zuckerberg approved Meta's use of pirated materials to train Llama AI

Engadget

As TechCrunch reports, the plaintiffs in the Kadrey v. Meta case submitted court documents describing the company's use of the LibGen dataset for AI training. LibGen is generally described as a "shadow library" that provides file-sharing access to academic and general-interest books, journals, images and other materials. Counsel for the plaintiffs, who include writers Sarah Silverman and Ta-Nehisi Coates, accused Zuckerberg of approving the use of LibGen for training despite concerns raised by company executives and employees who described it as a "dataset [they] know to be pirated." In addition, counsel said that Meta admitted to torrenting LibGen materials, even though its engineers felt uneasy about sharing them "from a [Meta-owned] corporate laptop." They accused the companies of using pirated materials from shadow libraries to train their AI models.


The Battle Over Books3 Could Change AI Forever

WIRED

After OpenAI released GPT-3 in July 2020, independent artificial intelligence researcher Shawn Presser and a few of his fellow machine-learning enthusiasts set a challenge for themselves: Could they recreate it? "We were like, OK, there's actually not that much standing in the way of us doing this ourselves," Presser says. So what if OpenAI had deep pockets and a head start? That summer, they pored over papers about GPT-3, strategizing in marathon Discord chats about how to best approximate its training data sets. Presser homed in on the books they needed.


Sarah Silverman sues OpenAI and Meta for copyright infringement

The Guardian

Silverman has filed the suits along with two authors, Christopher Golden and Richard Kadrey. They claim the AI models developed by OpenAI and Meta used their work as part of their training data. Tools like ChatGPT, a highly popular chatbot, are based on large language models that are fed vast amounts of data taken from the internet in order to train them to give convincing responses to text prompts from users. The suits claim the authors' works were obtained from "shadow library" sites that have "long been of interest to the AI-training community". The OpenAI suit includes exhibits claiming that, when prompted, ChatGPT summarised three books: Silverman's The Bedwetter, Ararat by Golden, and Kadrey's Sandman Slim. The Meta suit cites multiple works by Kadrey and Golden, alongside The Bedwetter, and flags a Meta paper that indicates LLaMA's training datasets included material taken from shadow libraries the suit describes as "flagrantly illegal".